GPGPU Performance Estimation with Core and Memory Frequency Scaling

Authors

  • Qiang Wang
  • Xiaowen Chu
Abstract

Graphics Processing Units (GPUs) support dynamic voltage and frequency scaling (DVFS) to balance computational performance and energy consumption. However, a simple and accurate method for estimating the performance of a given GPU kernel under different frequency settings on real hardware is still lacking, even though such estimates are important for choosing the best frequency configuration for energy saving. This paper presents a fine-grained model that estimates the execution time of GPU kernels under both core and memory frequency scaling. Across 12 GPU kernels and a 2.5x range of both core and memory frequencies, our model achieves accurate results (within 3.5%) on real hardware. Compared with cycle-level simulators, our model needs only a few simple micro-benchmarks to extract a set of hardware parameters, plus the performance counters of the kernels, to reach this accuracy.

Similar resources

Performance and Power-Aware Classification for Frequency Scaling of GPGPU Applications

The increased adoption of Graphics Processing Units (GPUs) to accelerate modern computationally intensive applications, together with the strict power and energy constraints of many computing systems, has pushed for the development of efficient procedures to exploit dynamic voltage and frequency scaling (DVFS) techniques in GPUs. Although previous works have applied several pattern recognition te...

Cache Power Budgeting for Performance

Power is arguably the critical resource in computer system design today. In this work, we focus on maximizing performance of a chip multiprocessor (CMP) system, for a given power budget, by developing techniques to budget power between processor cores and caches. Dynamic cache configuration can reduce cache capacity and associativity, thereby freeing up chip power, but may increase the miss rat...

Evaluating Scalability of Multi-threaded Applications on a Many-core Platform

Multicore processors have been effective in scaling application performance by dividing computation among multiple threads running in parallel. However, application performance does not necessarily improve as more cores are added. Application performance can be limited due to multiple bottlenecks including contention for shared resources such as caches and memory. In this paper, we perform a sc...

Processor-Memory Power Shifting for Multi-Core Systems

Maximum power consumption is an important consideration in server design, as the total power envelope affects cooling costs and can limit performance. One approach to limiting total power is power shifting, managing power budgets among system sub-components to meet an overall total constraint. In this paper, we investigate processor-memory power shifting on a multi-threaded, 32-core commercial ...

Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)

General-purpose graphics processing units (GPGPU) have emerged as an important class of shared memory parallel processing architectures, with widespread deployment in every computer class from high-end supercomputers to embedded mobile platforms. Relative to more traditional multicore systems of today, GPGPUs have distinctly higher degrees of hardware multithreading (hundreds of hardware thread...

Journal title:
  • CoRR

Volume abs/1701.05308

Publication date 2017